Dataset-2

Valentine’s Day Spending



gifts_age <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-02-13/gifts_age.csv")
Rows: 6 Columns: 9
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): Age
dbl (8): SpendingCelebrating, Candy, Flowers, Jewelry, GreetingCards, Evenin...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
glimpse(gifts_age)
Rows: 6
Columns: 9
$ Age                 <chr> "18-24", "25-34", "35-44", "45-54", "55-64", "65+"
$ SpendingCelebrating <dbl> 51, 40, 31, 19, 18, 13
$ Candy               <dbl> 70, 62, 58, 60, 50, 42
$ Flowers             <dbl> 50, 44, 41, 37, 32, 25
$ Jewelry             <dbl> 33, 34, 29, 20, 13, 8
$ GreetingCards       <dbl> 33, 33, 42, 42, 43, 44
$ EveningOut          <dbl> 41, 37, 30, 31, 29, 24
$ Clothing            <dbl> 33, 27, 26, 20, 19, 12
$ GiftCards           <dbl> 23, 19, 22, 23, 20, 20
inspect(gifts_age)

categorical variables:  
  name     class levels n missing                                  distribution
1  Age character      6 6       0 18-24 (16.7%), 25-34 (16.7%) ...             

quantitative variables:  
                 name   class min    Q1 median    Q3 max     mean        sd n
1 SpendingCelebrating numeric  13 18.25   25.0 37.75  51 28.66667 14.733183 6
2               Candy numeric  42 52.00   59.0 61.50  70 57.00000  9.777525 6
3             Flowers numeric  25 33.25   39.0 43.25  50 38.16667  8.886319 6
4             Jewelry numeric   8 14.75   24.5 32.00  34 22.83333 10.870449 6
5       GreetingCards numeric  33 35.25   42.0 42.75  44 39.50000  5.089204 6
6          EveningOut numeric  24 29.25   30.5 35.50  41 32.00000  6.066300 6
7            Clothing numeric  12 19.25   23.0 26.75  33 22.83333  7.359801 6
8           GiftCards numeric  19 20.00   21.0 22.75  23 21.16667  1.722401 6
  missing
1       0
2       0
3       0
4       0
5       0
6       0
7       0
8       0


giftsage_modified <- gifts_age %>%
   mutate(Age = as.factor(Age))
giftsage_modified
# A tibble: 6 × 9
  Age   SpendingCelebrating Candy Flowers Jewelry GreetingCards EveningOut
  <fct>               <dbl> <dbl>   <dbl>   <dbl>         <dbl>      <dbl>
1 18-24                  51    70      50      33            33         41
2 25-34                  40    62      44      34            33         37
3 35-44                  31    58      41      29            42         30
4 45-54                  19    60      37      20            42         31
5 55-64                  18    50      32      13            43         29
6 65+                    13    42      25       8            44         24
# ℹ 2 more variables: Clothing <dbl>, GiftCards <dbl>
Variable             Data type               Description
Age                  Factor (qualitative)    Age group
SpendingCelebrating  Double (quantitative)   Percent spending money on or celebrating Valentine’s Day
Candy                Double (quantitative)   Average percent spending on candy
Flowers              Double (quantitative)   Average percent spending on flowers
Jewelry              Double (quantitative)   Average percent spending on jewelry
GreetingCards        Double (quantitative)   Average percent spending on greeting cards
EveningOut           Double (quantitative)   Average percent spending on an evening out
Clothing             Double (quantitative)   Average percent spending on clothing
GiftCards            Double (quantitative)   Average percent spending on gift cards


Dependent variable: spending

Independent variable: age

- Relationship between age and spending across gift categories

- What are the Valentine’s Day spending trends across different age groups?

What research activity might have been carried out to obtain the data graphed here?

  • surveys

  • reports on consumer spending



# Reshape from wide to long: one row per Age-Category pair
data_long <- gifts_age %>%
  pivot_longer(cols = -Age, names_to = "Category", values_to = "AmountSpent")

# One line per spending category across the age groups
ggplot(data_long, aes(x = Age, y = AmountSpent, color = Category, group = Category)) +
  geom_line(linewidth = 1) +  # linewidth replaces size for lines since ggplot2 3.4.0
  geom_point(size = 2) +
  labs(title = "Spending by Age Group",
       x = "Age Group",
       y = "Amount Spent",
       color = "Spending Category") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

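Preprocessing: the data is pivoted from wide to long format. The eight spending columns (every column except Age) are stacked into two columns: Category holds the original column name and AmountSpent holds its value, giving one row per age group and category (6 × 8 = 48 rows).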
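
To make the reshaping concrete, here is a minimal sketch on a two-row, two-category excerpt of the data, with values copied from the glimpse() output earlier in this section. The names toy_wide and toy_long are illustrative only and are not part of the original analysis.

```r
library(tibble)
library(tidyr)

# Hypothetical excerpt: two age groups, two categories (values from glimpse())
toy_wide <- tibble(
  Age     = c("18-24", "65+"),
  Candy   = c(70, 42),
  Flowers = c(50, 25)
)

# Same pivot as in the analysis: every column except Age is stacked
toy_long <- pivot_longer(
  toy_wide,
  cols      = -Age,
  names_to  = "Category",
  values_to = "AmountSpent"
)

print(toy_long)

# Each input row yields one output row per pivoted column
stopifnot(nrow(toy_long) == nrow(toy_wide) * (ncol(toy_wide) - 1))
```

The long layout is what lets ggplot map Category to colour and draw one line per category from a single y column.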

colnames(giftsage_modified)
[1] "Age"                 "SpendingCelebrating" "Candy"              
[4] "Flowers"             "Jewelry"             "GreetingCards"      
[7] "EveningOut"          "Clothing"            "GiftCards"          
colnames(gifts_age)
[1] "Age"                 "SpendingCelebrating" "Candy"              
[4] "Flowers"             "Jewelry"             "GreetingCards"      
[7] "EveningOut"          "Clothing"            "GiftCards"          

Observations/surprises

- Did not expect spending on jewellery to be high among the younger age brackets.

- Had a preconceived notion that the 65+ age group would spend considerably on jewellery and flowers.

- Did not expect candy to rank so high either, especially for the older age groups.
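
One way to check these impressions against the numbers, rather than reading them off the plot, is to compare the youngest and oldest groups category by category. The sketch below is only an illustration: the values are transcribed from the glimpse() output earlier in this section, and the names gifts_check and age_change are hypothetical.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Gift categories transcribed from the glimpse() output (illustrative copy)
gifts_check <- tibble(
  Age           = c("18-24", "25-34", "35-44", "45-54", "55-64", "65+"),
  Candy         = c(70, 62, 58, 60, 50, 42),
  Flowers       = c(50, 44, 41, 37, 32, 25),
  Jewelry       = c(33, 34, 29, 20, 13, 8),
  GreetingCards = c(33, 33, 42, 42, 43, 44),
  EveningOut    = c(41, 37, 30, 31, 29, 24),
  Clothing      = c(33, 27, 26, 20, 19, 12),
  GiftCards     = c(23, 19, 22, 23, 20, 20)
)

# Per category: value for the youngest group, the oldest group, and the difference
age_change <- gifts_check %>%
  pivot_longer(cols = -Age, names_to = "Category", values_to = "Value") %>%
  group_by(Category) %>%
  summarise(
    youngest = first(Value),  # rows keep the original 18-24 ... 65+ order
    oldest   = last(Value),
    change   = oldest - youngest
  ) %>%
  arrange(change)

print(age_change)
```

On these transcribed values, greeting cards is the only category that is higher for the 65+ group than for the 18-24 group; every other category is lower, with candy showing the largest drop.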